Recent neural implicit representation-based methods have greatly advanced the state of the art in the long-standing and challenging problem of reconstructing a discrete surface from a sparse point cloud. These methods generally learn either a binary occupancy field or a signed/unsigned distance field (SDF/UDF) as the surface representation. However, all existing SDF/UDF-based methods use neural networks to implicitly regress the distance in a purely data-driven manner, which limits their accuracy and generalizability to some extent. In contrast, we propose the first geometry-guided method for UDF and its gradient estimation that explicitly formulates the unsigned distance of a query point as a learnable affine average of its distances to the tangent planes of neighbouring points. In addition, we model the local geometric structure of the input point cloud by explicitly learning a quadratic polynomial for each point. This not only facilitates upsampling the input sparse point cloud but also naturally induces unoriented normals, which further augment UDF estimation. Finally, to extract triangle meshes from the predicted UDF, we propose a customized edge-based marching-cubes module. We conduct extensive experiments and ablation studies to demonstrate the significant advantages of our method over state-of-the-art methods in terms of reconstruction accuracy, efficiency, and generalizability. The source code is publicly available at https://github.com/rsy6318/GeoUDF.
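As a rough, non-learned illustration of the geometry-guided formulation above (the function name, the softmax weighting, and the toy inputs are our assumptions rather than the paper's implementation), the unsigned distance of a query point can be approximated by an affine average of its distances to the neighbours' tangent planes:

```python
import numpy as np

def udf_from_tangent_planes(query, points, normals, logits):
    """Approximate UDF(query) as an affine (convex) average of the unsigned
    distances from the query to the tangent planes of neighbouring points.
    In the paper the averaging weights are learned; here a softmax over
    placeholder logits stands in for them."""
    dists = np.abs(np.sum((query[None, :] - points) * normals, axis=1))  # |(q - p_i) . n_i|
    w = np.exp(logits - logits.max())
    w = w / w.sum()                      # non-negative weights summing to one
    return float(np.dot(w, dists))

# Toy usage: three neighbours on the plane z = 0 with upward normals.
pts = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
nrm = np.tile(np.array([0.0, 0.0, 1.0]), (3, 1))
print(udf_from_tangent_planes(np.array([0.2, 0.2, 0.5]), pts, nrm, np.zeros(3)))  # ~0.5
```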
Time is one of the most important characteristics of time series, yet it has not received sufficient attention. Previous studies on time series forecasting mainly focus on mapping a past sub-series (the lookback window) to a future series (the forecast window), while the timestamps of the series usually play only an auxiliary role. Because of the point-wise processing within these windows, extrapolating patterns into the long-term future is difficult. To overcome this obstacle, we propose a brand-new time series forecasting framework named DateFormer, which shifts the focus to modeling time itself rather than following the above practice. Specifically, time series are first divided into patches to supervise the learning of dynamic date representations with date encoder representations from Transformers (DERT). These representations are then fed into a simple decoder to produce a coarser (or global) prediction, and are also used to help the model retrieve valuable information from the lookback window to learn a refined (or local) prediction. DateFormer obtains the final result by summing the two parts above. Our empirical studies on seven benchmarks show that, compared with sequence-modeling methods, the time-modeling approach is more effective for long-term series forecasting. DateFormer achieves state-of-the-art accuracy with a relative improvement of 40% and extends the maximum reliable forecasting horizon to the half-year level.
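In symbols (using our own notation, not necessarily the paper's), the final forecast is the sum of the date-driven global component and the lookback-driven local refinement:

$$\hat{Y} \;=\; \hat{Y}_{\text{global}} + \hat{Y}_{\text{local}},$$

where $\hat{Y}_{\text{global}}$ is decoded from the learned date representations and $\hat{Y}_{\text{local}}$ is the refinement extracted from the lookback window with the help of those representations.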
Signed networks are frequently observed in real life, with additional sign information associated with each edge, yet this information has been largely ignored in existing network models. This paper develops a unified embedding model for signed networks to disentangle the intertwined balance structure and anomaly effect, which can greatly facilitate downstream analyses, including community detection, anomaly detection, and network inference. The proposed model captures the balance structure and anomaly effect through a low-rank plus sparse matrix decomposition, which are jointly estimated via a regularized formulation. Its theoretical guarantees are established in terms of asymptotic consistency and finite-sample probability bounds for network embedding, community detection, and anomaly detection. The advantages of the proposed embedding model are further demonstrated through extensive numerical experiments on both synthetic networks and an international relations network.
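A generic regularized low-rank-plus-sparse formulation of this kind (the exact objective and penalties in the paper may differ) can be written as

$$\min_{L,\,S}\ \tfrac{1}{2}\,\|A - L - S\|_F^2 \;+\; \lambda_1 \|L\|_* \;+\; \lambda_2 \|S\|_1,$$

where $A$ is the signed adjacency matrix, the low-rank component $L$ captures the balance structure, the sparse component $S$ captures the anomaly effect, and $\|\cdot\|_*$ and $\|\cdot\|_1$ denote the nuclear and entrywise $\ell_1$ norms.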
Directed acyclic graph (DAG) models are widely used to represent causal relationships among random variables in many application domains. This paper studies a special class of non-Gaussian DAG models, in which the conditional variance of each node given its parents is a quadratic function of its conditional mean. Such a class of non-Gaussian DAG models is fairly flexible and admits many popular distributions as special cases, including Poisson, binomial, geometric, exponential, and gamma. To facilitate learning, we introduce a novel concept of topological layers and develop an efficient DAG learning algorithm. It first reconstructs the topological layers in a hierarchical fashion and then recovers the directed edges between nodes in different layers, which requires much less computational cost than most existing algorithms in the literature. Its advantages are also demonstrated in a number of simulated examples, as well as in applications to two real-life datasets, including NBA player statistics and cosmetics sales data collected by Alibaba.
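Concretely, the defining property of this class can be written (in notation of our own choosing) as a quadratic variance function,

$$\operatorname{Var}(X_j \mid \mathrm{pa}(X_j)) \;=\; \beta_{j2}\,\mu_j^{2} + \beta_{j1}\,\mu_j + \beta_{j0}, \qquad \mu_j = \mathbb{E}[X_j \mid \mathrm{pa}(X_j)];$$

for example, the Poisson case has variance $\mu_j$ and the exponential case has variance $\mu_j^{2}$.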
Acyclic models, often depicted as directed acyclic graphs (DAGs), have been widely employed to represent directed causal relations among collected nodes. In this paper, we propose an efficient method for learning linear non-Gaussian DAGs in the high-dimensional case, where the noise can follow any continuous non-Gaussian distribution. This is in sharp contrast to most existing DAG learning methods, which assume Gaussian noise with additional variance assumptions in order to attain exact DAG recovery. The proposed method leverages a novel concept of topological layers to facilitate DAG learning. In particular, we show that the topological layers can be exactly reconstructed in a bottom-up fashion, and that the parent-child relations among nodes in each layer can also be consistently established. More importantly, the proposed method does not require the faithfulness or parental faithfulness assumption that has been widely adopted in the DAG learning literature. Its advantages are also supported by numerical comparisons against several popular competitors in various simulated examples, as well as by a real application to the global spread of COVID-19.
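For reference, the underlying linear structural equation model (written in a standard form, not necessarily the paper's exact notation) is

$$X_j \;=\; \sum_{k \in \mathrm{pa}(j)} \beta_{jk}\, X_k \;+\; \varepsilon_j,$$

with mutually independent, continuous, non-Gaussian noises $\varepsilon_j$; the topological layers of the induced DAG are then reconstructed bottom-up before the directed edges are recovered.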
The past few years have witnessed the prevalence of self-supervised representation learning in the language and 2D vision communities. However, such advancements have not been fully migrated to the 3D point cloud learning community. Different from previous pre-training pipelines for 3D point clouds, which generally fall into the scope of either generative modeling or contrastive learning, in this paper we investigate a translative pre-training paradigm, namely PointVST, driven by a novel self-supervised pretext task of cross-modal translation from an input 3D object point cloud to its diverse forms of 2D rendered images (e.g., silhouette, depth, contour). Specifically, we begin by deducing view-conditioned point-wise embeddings via the insertion of a viewpoint indicator, and then adaptively aggregate a view-specific global codeword, which is further fed into subsequent 2D convolutional translation heads for image generation. We conduct extensive experiments on common task scenarios of 3D shape analysis, where our PointVST shows consistent and prominent performance superiority over current state-of-the-art methods under diverse evaluation protocols. Our code will be made publicly available.
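The following PyTorch-style sketch (module names, feature dimensions, and the max-pooling choice are assumptions on our part, not the paper's exact architecture) illustrates the flow from view-conditioned point embeddings to a view-specific codeword and a 2D convolutional translation head:

```python
import torch
import torch.nn as nn

class ViewConditionedTranslator(nn.Module):
    """Hypothetical sketch of a translative pretext head: point features are
    fused with a viewpoint indicator, pooled into a view-specific codeword,
    and decoded by 2D (transposed) convolutions into a rendered-image map."""
    def __init__(self, feat_dim=256):
        super().__init__()
        self.fuse = nn.Sequential(
            nn.Linear(feat_dim + 3, feat_dim), nn.ReLU(),
            nn.Linear(feat_dim, feat_dim))
        self.decode = nn.Sequential(
            nn.ConvTranspose2d(feat_dim, 128, 4, 4), nn.ReLU(),  # 1x1 -> 4x4
            nn.ConvTranspose2d(128, 64, 4, 4), nn.ReLU(),        # 4x4 -> 16x16
            nn.ConvTranspose2d(64, 1, 4, 4))                     # 16x16 -> 64x64

    def forward(self, point_feats, viewpoint):
        # point_feats: (B, N, feat_dim); viewpoint: (B, 3) view direction
        view = viewpoint.unsqueeze(1).expand(-1, point_feats.size(1), -1)
        fused = self.fuse(torch.cat([point_feats, view], dim=-1))  # (B, N, C)
        codeword = fused.max(dim=1).values                         # (B, C)
        return self.decode(codeword[:, :, None, None])             # (B, 1, 64, 64)
```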
Deep learning-based 3D object detectors have made significant progress in recent years and have been deployed in a wide range of applications. It is crucial to understand the robustness of detectors against adversarial attacks when employing them in security-critical applications. In this paper, we make the first attempt to conduct a thorough evaluation and analysis of the robustness of 3D detectors under adversarial attacks. Specifically, we first extend three kinds of adversarial attacks to the 3D object detection task to benchmark the robustness of state-of-the-art 3D object detectors on the KITTI and Waymo datasets, followed by an analysis of the relationship between robustness and properties of the detectors. We then explore the transferability of cross-model, cross-task, and cross-data attacks. Finally, we conduct comprehensive defense experiments for 3D detectors, demonstrating that simple transformations like flipping provide little improvement in robustness when the transformation strategy imposed on the input point cloud data is exposed to attackers. Our findings will facilitate investigations into understanding and defending against adversarial attacks on 3D object detectors to advance this field.
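As one concrete instance of the kind of attack being benchmarked (the function, its arguments, and the L-infinity budget below are illustrative assumptions, not the paper's code), a generic projected-gradient perturbation of an input point cloud can be sketched as:

```python
import torch

def pgd_perturb_points(points, detector_loss_fn, eps=0.05, alpha=0.01, steps=10):
    """Iteratively perturb point coordinates to maximize a differentiable
    detector loss, projecting the perturbation back into an L-inf ball."""
    delta = torch.zeros_like(points, requires_grad=True)
    for _ in range(steps):
        loss = detector_loss_fn(points + delta)   # higher loss = worse detections
        loss.backward()
        with torch.no_grad():
            delta += alpha * delta.grad.sign()
            delta.clamp_(-eps, eps)               # stay within the perturbation budget
        delta.grad.zero_()
    return (points + delta).detach()
```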
Point clouds are characterized by irregularity and unstructuredness, which pose challenges in efficient data exploitation and discriminative feature extraction. In this paper, we present an unsupervised deep neural architecture called Flattening-Net to represent irregular 3D point clouds of arbitrary geometry and topology as a completely regular 2D point geometry image (PGI) structure, in which coordinates of spatial points are captured in colors of image pixels. Intuitively, Flattening-Net implicitly approximates a locally smooth 3D-to-2D surface flattening process while effectively preserving neighborhood consistency. As a generic representation modality, PGI inherently encodes the intrinsic property of the underlying manifold structure and facilitates surface-style point feature aggregation. To demonstrate its potential, we construct a unified learning framework directly operating on PGIs to achieve diverse types of high-level and low-level downstream applications driven by specific task networks, including classification, segmentation, reconstruction, and upsampling. Extensive experiments demonstrate that our methods perform favorably against the current state-of-the-art competitors. We will make the code and data publicly available at https://github.com/keeganhk/Flattening-Net.
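To make the PGI idea tangible, the toy sketch below (a naive packing with made-up names, not Flattening-Net's learned flattening) stores the XYZ coordinates of a regular grid of points in the RGB channels of an image and recovers them again:

```python
import numpy as np

def points_to_pgi(points, res):
    """Pack res*res 3D points into a res x res x 3 'geometry image' whose
    pixel colors encode normalized XYZ coordinates."""
    assert points.shape == (res * res, 3)
    lo, hi = points.min(axis=0), points.max(axis=0)
    colors = (points - lo) / np.maximum(hi - lo, 1e-8)   # normalize to [0, 1]
    return colors.reshape(res, res, 3), lo, hi

def pgi_to_points(pgi, lo, hi):
    """Recover point coordinates from the geometry image."""
    return pgi.reshape(-1, 3) * (hi - lo) + lo

pts = np.random.rand(16 * 16, 3)
pgi, lo, hi = points_to_pgi(pts, 16)
assert np.allclose(pgi_to_points(pgi, lo, hi), pts)
```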
Directly training a document-to-document (Doc2Doc) neural machine translation (NMT) model via Transformer from scratch, especially on small datasets, usually fails to converge. Our dedicated probing tasks show that 1) both the absolute and relative position information gets gradually weakened or even vanishes once it reaches the upper encoder layers, and 2) the vanishing of absolute position information in the encoder output causes the training failure of Doc2Doc NMT. To alleviate this problem, we propose a position-aware Transformer (P-Transformer) to enhance both the absolute and relative position information in both self-attention and cross-attention. Specifically, we integrate absolute positional information, i.e., position embeddings, into the query-key pairs in both self-attention and cross-attention through a simple yet effective addition operation. Moreover, we also integrate relative position encoding in self-attention. The proposed P-Transformer utilizes sinusoidal position encoding and does not require any task-specific position embedding, segment embedding, or attention mechanism. With the above methods, we build a Doc2Doc NMT model with P-Transformer, which ingests the source document and generates the complete target document in a sequence-to-sequence (seq2seq) manner. In addition, P-Transformer can be applied to seq2seq-based document-to-sentence (Doc2Sent) and sentence-to-sentence (Sent2Sent) translation. Extensive experimental results on Doc2Doc NMT show that P-Transformer significantly outperforms strong baselines on 9 widely used document-level datasets across 7 language pairs, covering small, middle, and large scales, and achieves a new state of the art. Experiments on discourse phenomena show that our Doc2Doc NMT models improve translation quality in terms of both BLEU and discourse coherence. We make our code available on GitHub.
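A minimal sketch of the core idea, assuming standard sinusoidal encodings and scaled dot-product attention (the helper names and shapes are ours, not the paper's implementation; relative position encoding is omitted):

```python
import math
import torch

def sinusoidal_pe(seq_len, dim):
    """Standard sinusoidal position encoding (assumes an even dim)."""
    pos = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    div = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32)
                    * (-math.log(10000.0) / dim))
    pe = torch.zeros(seq_len, dim)
    pe[:, 0::2] = torch.sin(pos * div)
    pe[:, 1::2] = torch.cos(pos * div)
    return pe

def position_aware_attention(q, k, v):
    """Inject absolute positions into the query-key pair by simple addition
    before computing scaled dot-product attention."""
    d = q.size(-1)
    q = q + sinusoidal_pe(q.size(-2), d)
    k = k + sinusoidal_pe(k.size(-2), d)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d)
    return torch.softmax(scores, dim=-1) @ v
```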
Speech-to-speech translation directly translates a speech utterance in one language into a speech utterance in another, and has great potential in tasks such as simultaneous interpretation. State-of-the-art models usually contain an auxiliary module for phoneme sequence prediction, which requires textual annotation of the training dataset. We propose a direct speech-to-speech translation model that can be trained without any textual annotation or content information. Instead of introducing an auxiliary phoneme prediction task into the model, we propose to use bottleneck features as intermediate training objectives to ensure the translation performance of the system. Experiments on Mandarin-Cantonese speech translation demonstrate the feasibility of the proposed approach, and its performance matches that of a cascaded system in terms of translation and synthesis quality.
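One way to picture such a training objective, assuming L1 regression losses and a weighting factor of our own choosing (the paper's exact losses may differ), is:

```python
import torch.nn.functional as F

def s2st_loss(pred_bottleneck, tgt_bottleneck, pred_spec, tgt_spec, weight=1.0):
    """Combine the final spectrogram reconstruction loss with a regression
    loss on bottleneck features used as an intermediate training objective,
    replacing an auxiliary phoneme-prediction task."""
    return (F.l1_loss(pred_spec, tgt_spec)
            + weight * F.l1_loss(pred_bottleneck, tgt_bottleneck))
```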